Learning with ensembles: How overfitting can be useful

Authors

  • Peter Sollich
  • Anders Krogh
Abstract

We study the characteristics of learning with ensembles. Solving exactly the simple model of an ensemble of linear students, we find surprisingly rich behaviour. For learning in large ensembles, it is advantageous to use under-regularized students, which actually over-fit the training data. Globally optimal performance can be obtained by choosing the training set sizes of the students appropriately. For smaller ensembles, optimization of the ensemble weights can yield significant improvements in ensemble generalization performance, in particular if the individual students are subject to noise in the training process. Choosing students with a wide range of regularization parameters makes this improvement robust against changes in the unknown level of noise in the training data.
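
To make the abstract's setting concrete, here is a minimal numpy sketch of an ensemble of linear students in the spirit of the paper: each student is a ridge regressor trained on its own random subset of the data with a deliberately small regularization parameter, so that it over-fits individually, while the ensemble average still generalizes well. The teacher/noise setup, subset sizes, and the regularization value are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative teacher/student setup (assumed, not the paper's exact model):
# a noisy linear teacher, and students that are ridge regressors.
n_features, n_train, n_test = 20, 100, 1000
w_teacher = rng.normal(size=n_features)
X = rng.normal(size=(n_train, n_features))
y = X @ w_teacher + 0.5 * rng.normal(size=n_train)
X_test = rng.normal(size=(n_test, n_features))
y_test = X_test @ w_teacher

def ridge_fit(A, b, lam):
    """Linear student: ridge regression with regularization strength lam."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)

# Ensemble of under-regularized students, each trained on its own subset.
k, subset_size = 25, 60
preds = []
for _ in range(k):
    idx = rng.choice(n_train, size=subset_size, replace=False)
    w = ridge_fit(X[idx], y[idx], lam=1e-3)   # tiny lam: each student over-fits
    preds.append(X_test @ w)

single_mse = np.mean((preds[0] - y_test) ** 2)
ensemble_mse = np.mean((np.mean(preds, axis=0) - y_test) ** 2)
print(f"single student MSE: {single_mse:.3f}   ensemble MSE: {ensemble_mse:.3f}")
```

Averaging the k students cancels much of the variance that each under-regularized student incurs, which is the sense in which over-fitting individual students can be useful at the ensemble level.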


Similar articles

Overfitting cautious selection of classifier ensembles with genetic algorithms

Information fusion research has recently focused on the characteristics of the decision profiles of ensemble members in order to optimize performance. These characteristics are particularly important in the selection of ensemble members. However, even though the control of overfitting is a challenge in machine learning problems, much less work has been devoted to the control of overfitting in s...
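
A hedged sketch of the idea, under assumed details: a simple genetic algorithm searches over binary masks selecting members from a pool of classifiers, with fitness measured as majority-vote accuracy on a validation set. The synthetic prediction pool, GA operators, and rates are all illustrative; guarding against overfitting the selection itself would additionally require an independent holdout set, which is the concern the abstract raises.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pool: each of 30 members predicts a binary label for 200
# validation points and is right about 65% of the time, independently.
n_members, n_val = 30, 200
y_val = rng.integers(0, 2, n_val)
agree = rng.random((n_members, n_val)) < 0.65
pool = np.where(agree, y_val, 1 - y_val)

def fitness(mask):
    """Majority-vote validation accuracy of the members selected by 'mask'."""
    if mask.sum() == 0:
        return 0.0
    vote = pool[mask.astype(bool)].mean(axis=0) > 0.5
    return (vote == y_val).mean()

pop = rng.integers(0, 2, (40, n_members))         # population of selection masks
for generation in range(50):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[::-1][:20]]  # truncation selection
    cuts = rng.integers(1, n_members, 20)
    kids = np.array([np.concatenate((parents[i][:c],
                                     parents[(i + 1) % 20][c:]))
                     for i, c in enumerate(cuts)])              # one-point crossover
    kids = np.where(rng.random(kids.shape) < 0.02, 1 - kids, kids)  # mutation
    pop = np.vstack((parents, kids))

best = max(pop, key=fitness)
print("members selected:", best.sum(), "validation accuracy:", fitness(best))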


Checkpoint Ensembles: Ensemble Methods from a Single Training Process

We present the checkpoint ensembles method that can learn ensemble models on a single training process. Although checkpoint ensembles can be applied to any parametric iterative learning technique, here we focus on neural networks. Neural networks’ composable and simple neurons make it possible to capture many individual and interaction effects among features. However, small sample sizes and sam...
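
As a rough illustration of the idea (not the authors' exact method), the numpy sketch below trains a logistic-regression model with plain SGD, saves parameter snapshots from the last few epochs of a single run, and averages the snapshots' predicted probabilities. The data, learning rate, and checkpoint schedule are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy data: binary labels from a linear logistic teacher.
X = rng.normal(size=(500, 10))
w_true = rng.normal(size=10)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
y = (sigmoid(X @ w_true) > rng.random(500)).astype(float)

w = np.zeros(10)
checkpoints = []
for epoch in range(30):
    for i in rng.permutation(len(X)):              # plain SGD, one pass per epoch
        grad = (sigmoid(X[i] @ w) - y[i]) * X[i]
        w -= 0.1 * grad
    if epoch >= 25:
        checkpoints.append(w.copy())               # save late-run snapshots

# Checkpoint ensemble: average the predictions of the saved snapshots.
probs = np.mean([sigmoid(X @ c) for c in checkpoints], axis=0)
print(f"checkpoint-ensemble training accuracy: {((probs > 0.5) == y).mean():.3f}")
```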


The Bias Variance Trade-Off in Bootstrapped Error Correcting Output Code Ensembles

By performing experiments on publicly available multi-class datasets we examine the effect of bootstrapping on the bias/variance behaviour of error-correcting output code ensembles. We present evidence to show that the general trend is for bootstrapping to reduce variance but to slightly increase bias error. This generally leads to an improvement in the lowest attainable ensemble error, however...
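
A small scikit-learn sketch of the experimental idea, with illustrative data and parameters: error-correcting output code (ECOC) classifiers built on decision trees are combined by majority vote, once without and once with bootstrap resampling, and member disagreement with the vote serves as a crude variance proxy. This is a hedged reconstruction of the setup, not the paper's exact protocol or datasets.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.multiclass import OutputCodeClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_classes=3, n_informative=6,
                           random_state=0)
X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]

def ecoc_vote(bootstrap, n_members=15):
    """Majority vote of ECOC tree ensembles, with or without bootstrapping."""
    preds = []
    for m in range(n_members):
        idx = (rng.integers(0, len(X_tr), len(X_tr)) if bootstrap
               else np.arange(len(X_tr)))
        clf = OutputCodeClassifier(DecisionTreeClassifier(random_state=m),
                                   code_size=2.0, random_state=m)
        preds.append(clf.fit(X_tr[idx], y_tr[idx]).predict(X_te))
    preds = np.array(preds)
    vote = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)
    # Disagreement of members with the vote is a rough variance proxy.
    return (vote == y_te).mean(), (preds != vote).mean()

for bs in (False, True):
    acc, dis = ecoc_vote(bs)
    print(f"bootstrap={bs}: vote accuracy={acc:.3f}, member disagreement={dis:.3f}")
```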


SVM Ensembles Are Better When Different Kernel Types Are Combined

Support Vector Machines (SVM) are strong classifiers, but large data sets might lead to prohibitively long computation times and high memory requirements. SVM ensembles, where each single SVM sees only a fraction of the data, can be an approach to overcome this barrier. In continuation of related work in this field, we construct SVM ensembles with Bagging and Boosting. As a new idea we analyze S...
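
The combination the title describes maps naturally onto scikit-learn's VotingClassifier; the sketch below (with a toy dataset and default hyperparameters as assumptions) soft-votes one SVC per kernel type and compares it against the individual members.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One SVM per kernel type; soft voting averages their class probabilities.
members = [(k, SVC(kernel=k, probability=True, random_state=0))
           for k in ("linear", "poly", "rbf", "sigmoid")]
ensemble = VotingClassifier(members, voting="soft").fit(X_tr, y_tr)

for name, clf in members:
    print(name, f"{clf.fit(X_tr, y_tr).score(X_te, y_te):.3f}")
print("kernel ensemble:", f"{ensemble.score(X_te, y_te):.3f}")
```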


Research on Decision Forest Learning Algorithm

Decision forests are investigated for their ability to provide insight into the confidence associated with each prediction; such ensembles increase predictive accuracy over an individual decision tree model. This paper proposes a novel “bottom-top” (BT) search strategy that learns the tree structure by combining different branches with the same root, and new branches can be created to o...
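
The paper's “bottom-top” search strategy is specific to it and not reproduced here; as a generic illustration of the first point, the per-prediction confidence a forest provides, the sketch below reads off the averaged class probabilities of a standard random forest. The dataset and forest size are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Class probabilities averaged across the trees serve as a confidence score
# attached to each individual prediction.
proba = forest.predict_proba(X_te)
pred, conf = proba.argmax(axis=1), proba.max(axis=1)
for p, c, t in list(zip(pred, conf, y_te))[:5]:
    print(f"predicted {p} (confidence {c:.2f}), true {t}")
```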



Journal:

Volume   Issue

Pages  -

Publication date 1995